From text to formants - indirect model for trajectory prediction based on a multi-speaker parallel speech database

نویسندگان

  • Kálmán Abari
  • Tamás Gábor Csapó
  • Bálint Tóth
  • Gábor Olaszy
چکیده

An indirect model is presented, capable of estimating formant trajectories from text only (Text-to-Formants, TTF). The result is a phonetically correct formant trajectory flow of any virtual speech signal, i.e. one that has never been uttered. The focus is on the pattern forms inside the given sound, taking into account the sound environment (up to quinphone), and not on individual formant value measurements. The model is based on a multi-speaker parallel speech database with precise manual corrections and a HMM-based formant trajectory predictor. The validation of the TTF model shows that formant trajectories can be predicted with good accuracy from text. The model indirectly gives information about a theoretically possible articulation flow of the sentence. Thus it gives a general ‘formantprint’ of the language.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

متن کامل

Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectory based on Hidden Markov Models (HMM), for improved formant estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilization of a timesequence of peaks which satisfies continui...

متن کامل

The Use of Group Delay Features of Linear Prediction Model for Speaker Recognition

New text independent speaker identification method is presented. Phase spectrum of allpole linear prediction (LP) model is used to derive the speech features. The features are represented by pairs of numbers that are calculated from group delay extremums of LP model spectrum. The first component of the pair is an argument of maximum of group delay of all pole LP model spectrum and the second is...

متن کامل

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

Comparative Analysis of Formants of British, American and Asutralian Accents

This paper compares and quantifies the differences between formants of speech across accents. The cross entropy information measure is used to compare the differences between the formants of the vowels of three major English accents namely British, American and Australian. An improved formant estimation method, based on a linear prediction (LP) model feature analysis and a hidden Markov model (...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015